Logistic distribution — the “sigmoid” law on ℝ#
The logistic distribution is a symmetric, bell-shaped continuous distribution on the real line whose CDF is the logistic (sigmoid) function. It is closely tied to log-odds (logit) transformations and to logistic regression (as an error model), and it provides a simple, heavier-tailed alternative to the normal distribution.
What you’ll learn#
how the PDF/CDF/quantile relate to the sigmoid and logit
closed-form moments (mean/variance/skewness/kurtosis), MGF/CF, and entropy
parameter interpretation (location \(\mu\), scale \(s\)) and how shape changes
NumPy-only sampling via inverse transform + Monte Carlo validation
practical usage via scipy.stats.logistic (pdf, cdf, rvs, fit)
import platform
import numpy as np
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots
import scipy
from scipy import optimize, stats
from scipy.stats import chi2, logistic, norm
# Plotly rendering (CKC convention)
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
# Reproducibility
rng = np.random.default_rng(7)
np.set_printoptions(precision=4, suppress=True)
print("Python", platform.python_version())
print("NumPy", np.__version__)
print("SciPy", scipy.__version__)
Python 3.12.9
NumPy 1.26.2
SciPy 1.15.0
1) Title & Classification#
Name: logistic
Type: continuous distribution
Support: \(x \in (-\infty, \infty)\)
Parameter space: location \(\mu \in \mathbb{R}\) and scale \(s > 0\)
We write \(X \sim \mathrm{Logistic}(\mu, s)\).
The standard logistic is \(\mathrm{Logistic}(0,1)\).
SciPy uses the same location/scale form:
stats.logistic(loc=mu, scale=s).
2) Intuition & Motivation#
2.1 What it models#
The logistic distribution is a good model for real-valued noise that is:
symmetric (centered around \(\mu\))
unimodal (single peak at \(\mu\))
heavier-tailed than a normal (but still exponentially decaying)
A practical intuition: compared to a normal distribution with the same variance, logistic puts more probability mass in the tails.
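To make that claim concrete, here is a small check (a sketch using scipy.stats, not part of the later helper functions): we match variances by taking \(s=\sqrt{3}/\pi\) and compare two-sided tail probabilities of a unit-variance logistic and a standard normal.
# Tail comparison: unit-variance logistic vs standard normal
# P(|X| > c) is larger for the logistic at every threshold c
s_unit = np.sqrt(3.0) / np.pi  # gives Var = (pi * s_unit)^2 / 3 = 1
for c in [1.0, 2.0, 3.0, 4.0]:
    p_logistic = 2.0 * stats.logistic(scale=s_unit).sf(c)
    p_normal = 2.0 * stats.norm.sf(c)
    print(f"c={c:.0f}: logistic tail {p_logistic:.6f} vs normal tail {p_normal:.6f}")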
2.2 Typical real-world use cases#
Latent-variable view of logistic regression: if a latent score is perturbed by logistic noise and thresholded, the resulting class probability is a sigmoid.
Log-odds modeling: if \(P \in (0,1)\) is a random probability, then \(\log\!\left(\frac{P}{1-P}\right)\) lives on \(\mathbb{R}\); logistic is a natural simple choice for such log-odds.
Convenient alternative to a normal: similar bell shape, simple CDF/quantile.
Mixture models / generative models: mixtures of logistics are used to model complex continuous densities (notably in some neural image models).
2.3 Relations to other distributions#
Uniform ↔ logistic (logit link): if \(U\sim\mathrm{Unif}(0,1)\), then \(\log\!\left(\frac{U}{1-U}\right) \sim \mathrm{Logistic}(0,1)\). Conversely, if \(X\sim\mathrm{Logistic}(\mu,s)\), then \(F(X)\sim\mathrm{Unif}(0,1)\).
Gumbel difference: if \(G_1, G_2\) are i.i.d. Gumbel with the same scale, then \(G_1 - G_2\) is logistic.
Normal approximation: matching variances gives \(\mathrm{Logistic}(0, s) \approx \mathcal{N}(0,1)\) when \(s=\sqrt{3}/\pi\approx 0.5513\).
Log-logistic: if \(X\sim\mathrm{Logistic}(\mu,s)\), then \(\exp(X)\) is log-logistic.
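A quick Monte Carlo sanity check of these relations (a sketch: we compare samples to the target distributions with Kolmogorov–Smirnov statistics, which should be close to 0; the log-logistic is SciPy's fisk distribution with shape \(c=1/s\)).
# Relations check via Kolmogorov–Smirnov statistics (smaller = closer)
u = rng.random(100_000)
logit_u = np.log(u) - np.log1p(-u)  # logit of Uniform(0,1)
print("logit(U) vs Logistic(0,1):", stats.kstest(logit_u, stats.logistic.cdf).statistic)
g1 = stats.gumbel_r.rvs(size=100_000, random_state=rng)
g2 = stats.gumbel_r.rvs(size=100_000, random_state=rng)
print("G1 - G2  vs Logistic(0,1):", stats.kstest(g1 - g2, stats.logistic.cdf).statistic)
x_log = stats.logistic(loc=0.0, scale=0.5).rvs(size=100_000, random_state=rng)
print("exp(X)   vs log-logistic :", stats.kstest(np.exp(x_log), stats.fisk(c=2.0).cdf).statistic)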
3) Formal Definition#
Let \(X\sim\mathrm{Logistic}(\mu,s)\) with location \(\mu\in\mathbb{R}\), scale \(s>0\), and standardized variable \(z=\dfrac{x-\mu}{s}\).
3.1 PDF#
Different equivalent forms are useful:
$$f(x;\mu,s) \;=\; \frac{e^{-z}}{s\left(1+e^{-z}\right)^{2}} \;=\; \frac{1}{s}\,\sigma(z)\bigl(1-\sigma(z)\bigr), \qquad z=\frac{x-\mu}{s},$$
where \(\sigma(z)=\frac{1}{1+e^{-z}}\).
3.2 CDF#
The CDF is the sigmoid applied to the standardized argument:
$$F(x;\mu,s) \;=\; \sigma\!\left(\frac{x-\mu}{s}\right) \;=\; \frac{1}{1+e^{-(x-\mu)/s}}.$$
3.3 Quantile function (inverse CDF)#
For \(p\in(0,1)\):
$$F^{-1}(p;\mu,s) \;=\; \mu + s\,\log\!\left(\frac{p}{1-p}\right) \;=\; \mu + s\,\mathrm{logit}(p).$$
This closed-form inverse CDF makes inverse transform sampling especially simple.
def sigmoid(z: np.ndarray) -> np.ndarray:
# Stable logistic function σ(z) = 1 / (1 + exp(-z)).
z = np.asarray(z, dtype=float)
out = np.empty_like(z)
pos = z >= 0
out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
ez = np.exp(z[~pos])
out[~pos] = ez / (1.0 + ez)
return out
def logistic_cdf(x: np.ndarray, mu: float = 0.0, s: float = 1.0) -> np.ndarray:
if s <= 0:
raise ValueError("scale s must be > 0")
z = (np.asarray(x, dtype=float) - mu) / s
return sigmoid(z)
def logistic_pdf(x: np.ndarray, mu: float = 0.0, s: float = 1.0) -> np.ndarray:
if s <= 0:
raise ValueError("scale s must be > 0")
z = (np.asarray(x, dtype=float) - mu) / s
p = sigmoid(z)
return (p * (1.0 - p)) / s
def logistic_logpdf(x: np.ndarray, mu: float = 0.0, s: float = 1.0) -> np.ndarray:
# Stable log-PDF using logaddexp:
# log f(x) = -log s - z - 2 log(1 + exp(-z)), where z=(x-mu)/s.
if s <= 0:
raise ValueError("scale s must be > 0")
z = (np.asarray(x, dtype=float) - mu) / s
return -np.log(s) - z - 2.0 * np.logaddexp(0.0, -z)
def logistic_ppf(p: np.ndarray, mu: float = 0.0, s: float = 1.0, eps: float = 1e-12) -> np.ndarray:
if s <= 0:
raise ValueError("scale s must be > 0")
p = np.asarray(p, dtype=float)
p = np.clip(p, eps, 1.0 - eps)
return mu + s * (np.log(p) - np.log1p(-p))
def logistic_rvs(
rng: np.random.Generator,
size: int | tuple[int, ...],
mu: float = 0.0,
s: float = 1.0,
) -> np.ndarray:
# NumPy-only sampling via inverse CDF.
u = rng.random(size=size)
return logistic_ppf(u, mu=mu, s=s)
def logistic_moments(mu: float = 0.0, s: float = 1.0) -> dict:
if s <= 0:
raise ValueError("scale s must be > 0")
mean = mu
var = (np.pi * s) ** 2 / 3.0
return {
"mean": mean,
"variance": var,
"skewness": 0.0,
"kurtosis": 4.2, # non-excess
"excess_kurtosis": 6.0 / 5.0,
"median": mu,
"mode": mu,
}
def logistic_entropy(s: float = 1.0) -> float:
if s <= 0:
raise ValueError("scale s must be > 0")
return float(np.log(s) + 2.0)
def logistic_mgf(t: np.ndarray, mu: float = 0.0, s: float = 1.0) -> np.ndarray:
# MGF M_X(t) = E[e^{tX}] for |t| < 1/s.
if s <= 0:
raise ValueError("scale s must be > 0")
t = np.asarray(t, dtype=float)
x = np.pi * s * t
out = np.full_like(t, np.nan, dtype=float)
ok = np.abs(t) < (1.0 / s)
ratio = np.empty_like(x)
small = np.abs(x) < 1e-4
ratio[small] = 1.0 + (x[small] ** 2) / 6.0 + 7.0 * (x[small] ** 4) / 360.0
ratio[~small] = x[~small] / np.sin(x[~small])
out[ok] = np.exp(mu * t[ok]) * ratio[ok]
return out
def logistic_cf(t: np.ndarray, mu: float = 0.0, s: float = 1.0) -> np.ndarray:
# Characteristic function φ_X(t) = E[e^{itX}] for real t.
if s <= 0:
raise ValueError("scale s must be > 0")
t = np.asarray(t, dtype=float)
x = np.pi * s * t
ratio = np.empty_like(x)
small = np.abs(x) < 1e-4
ratio[small] = 1.0 - (x[small] ** 2) / 6.0 + 7.0 * (x[small] ** 4) / 360.0
ratio[~small] = x[~small] / np.sinh(x[~small])
return np.exp(1j * mu * t) * ratio
4) Moments & Properties#
Let \(X\sim\mathrm{Logistic}(\mu,s)\).
4.1 Mean, variance, skewness, kurtosis#
Mean: \(\mathbb{E}[X] = \mu\).
Variance: \(\mathrm{Var}(X) = \dfrac{\pi^2 s^2}{3}\).
Skewness: \(0\) (symmetry).
Kurtosis: \(4.2\) (so excess kurtosis is \(6/5=1.2\)).
Also:
Median: \(\mu\).
Mode: \(\mu\).
4.2 MGF and characteristic function#
The MGF exists only on a strip around 0 (because the tails are exponential):
$$M_X(t) \;=\; \mathbb{E}\!\left[e^{tX}\right] \;=\; e^{\mu t}\,\frac{\pi s t}{\sin(\pi s t)}, \qquad |t| < \frac{1}{s}.$$
The characteristic function exists for all real \(t\):
$$\varphi_X(t) \;=\; \mathbb{E}\!\left[e^{itX}\right] \;=\; e^{i\mu t}\,\frac{\pi s t}{\sinh(\pi s t)}.$$
4.3 Differential entropy#
The logistic distribution has a simple differential entropy (in nats):
$$h(X) \;=\; \log s + 2.$$
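A quick Monte Carlo check of this formula using the NumPy-only helpers defined above: the average negative log-density of samples estimates the entropy.
# Differential entropy check: -E[log f(X)] should be close to log(s) + 2
s_check = 1.3
x_ent = logistic_rvs(rng, size=200_000, mu=0.0, s=s_check)
h_mc = float(-np.mean(logistic_logpdf(x_ent, mu=0.0, s=s_check)))
print("entropy (mc)    ", h_mc)
print("entropy (theory)", logistic_entropy(s_check))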
4.4 Tail behavior#
For large \(|x|\), the logistic density behaves like
$$f(x;\mu,s) \;\approx\; \frac{1}{s}\,e^{-|x-\mu|/s},$$
so it has exponential tails (heavier than Gaussian, lighter than power-law tails).
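A small numeric illustration of the exponential-tail approximation for \(\mu=0,\ s=1\) (the ratio in the last column should approach 1 as \(x\) grows).
# Tail approximation check: f(x) vs exp(-|x|) for mu=0, s=1
xs = np.array([3.0, 5.0, 10.0, 20.0])
exact = logistic_pdf(xs)
approx = np.exp(-np.abs(xs))
print(np.column_stack([xs, exact, approx, exact / approx]))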
# Quick numerical checks: moments + MGF (Monte Carlo)
mu0, s0 = 0.7, 1.3
n = 200_000
samples = logistic_rvs(rng, size=n, mu=mu0, s=s0)
mom = logistic_moments(mu=mu0, s=s0)
mean_mc = samples.mean()
var_mc = samples.var(ddof=0)
skew_mc = stats.skew(samples)
kurt_mc = stats.kurtosis(samples, fisher=False) # non-excess
mom, mean_mc, var_mc, skew_mc, kurt_mc
({'mean': 0.7,
'variance': 5.559877145947005,
'skewness': 0.0,
'kurtosis': 4.2,
'excess_kurtosis': 1.2,
'median': 0.7,
'mode': 0.7},
0.703193870225651,
5.502046897613281,
-0.004496053448897028,
4.170427457166342)
# MGF check for a few t in the valid range |t| < 1/s
# (Monte Carlo estimate: mean(exp(tX)))
ts = np.array([-0.4, -0.2, 0.2, 0.4]) / s0 # safely within (-1/s, 1/s)
mgf_theory = logistic_mgf(ts, mu=mu0, s=s0)
mgf_mc = np.array([np.mean(np.exp(t * samples)) for t in ts])
np.column_stack([ts, mgf_theory, mgf_mc])
array([[-0.3077, 1.0653, 1.0606],
[-0.1538, 0.9598, 0.9587],
[ 0.1538, 1.1905, 1.1902],
[ 0.3077, 1.6389, 1.6343]])
5) Parameter Interpretation#
5.1 Meaning of the parameters#
Location \(\mu\) shifts the distribution left/right.
mean = median = mode = \(\mu\)
Scale \(s\) stretches/compresses the distribution.
standard deviation: \(\sigma = \dfrac{\pi s}{\sqrt{3}}\)
interquartile range (IQR): \(\mathrm{IQR} = F^{-1}(0.75)-F^{-1}(0.25)=2s\log 3\).
5.2 Shape changes#
Increasing \(s\) makes the density wider and the peak lower.
Decreasing \(s\) concentrates mass more tightly around \(\mu\).
Because this is a location–scale family, changing \((\mu,s)\) never changes the fundamental shape; it only shifts and rescales it.
# Useful scale relationships
def logistic_sd(s: float) -> float:
return float(np.pi * s / np.sqrt(3.0))
def logistic_iqr(s: float) -> float:
return float(2.0 * s * np.log(3.0))
for s in [0.5, 1.0, 2.0]:
print(f"s={s:>4}: sd={logistic_sd(s):.4f}, IQR={logistic_iqr(s):.4f}")
s= 0.5: sd=0.9069, IQR=1.0986
s= 1.0: sd=1.8138, IQR=2.1972
s= 2.0: sd=3.6276, IQR=4.3944
6) Derivations#
6.1 Expectation#
A very convenient representation comes from inverse-CDF sampling. If \(U\sim\mathrm{Unif}(0,1)\), then
$$X \;=\; \mu + s\,\log\!\left(\frac{U}{1-U}\right) \;\sim\; \mathrm{Logistic}(\mu, s).$$
So
$$\mathbb{E}[X] \;=\; \mu + s\int_0^1 \log\!\left(\frac{u}{1-u}\right)du.$$
But the integrand is antisymmetric around \(1/2\): substituting \(u \mapsto 1-u\) sends \(\log\!\left(\frac{u}{1-u}\right)\) to \(-\log\!\left(\frac{u}{1-u}\right)\),
so the integral must be \(0\). Therefore \(\mathbb{E}[X]=\mu\).
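A one-line Monte Carlo confirmation of the symmetry argument (a sketch: the sample mean of \(\log\frac{U}{1-U}\) should be near 0).
# The logit of a uniform draw has mean 0 by symmetry around u = 1/2
u_sym = rng.random(1_000_000)
print("mean of log(U/(1-U)):", float(np.mean(np.log(u_sym) - np.log1p(-u_sym))))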
6.2 MGF and variance#
Let \(Z\sim\mathrm{Logistic}(0,1)\) with CDF \(F(z)=\sigma(z)\). Use the substitution \(u=F(z)\), so that \(z=\log\!\left(\frac{u}{1-u}\right)\). Because \(du=f(z)\,dz\), we get
$$M_Z(t) \;=\; \int_{-\infty}^{\infty} e^{tz} f(z)\,dz \;=\; \int_0^1 \left(\frac{u}{1-u}\right)^{t} du \;=\; B(1+t,\,1-t) \;=\; \Gamma(1+t)\,\Gamma(1-t).$$
This integral is finite only if \(t\in(-1,1)\). Using the reflection identity \(\Gamma(1+t)\Gamma(1-t)=\dfrac{\pi t}{\sin(\pi t)}\), we obtain
$$M_Z(t) \;=\; \frac{\pi t}{\sin(\pi t)}, \qquad |t|<1.$$
For a general location–scale transform \(X=\mu+sZ\),
$$M_X(t) \;=\; e^{\mu t}\,M_Z(st) \;=\; e^{\mu t}\,\frac{\pi s t}{\sin(\pi s t)}, \qquad |t|<\frac{1}{s}.$$
To get the variance, expand around \(t=0\). Using
$$\frac{\pi s t}{\sin(\pi s t)} \;=\; 1 + \frac{(\pi s t)^2}{6} + O(t^4),$$
we get
$$M_X(t) \;=\; 1 + \mu t + \left(\frac{\mu^2}{2} + \frac{\pi^2 s^2}{6}\right)t^2 + O(t^3).$$
So \(\mathbb{E}[X]=M_X'(0)=\mu\) and \(\mathbb{E}[X^2]=M_X''(0)=\mu^2+\dfrac{\pi^2 s^2}{3}\). Therefore
$$\mathrm{Var}(X) \;=\; \mathbb{E}[X^2]-\mathbb{E}[X]^2 \;=\; \frac{\pi^2 s^2}{3}.$$
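We can also check the derivation numerically: the Gamma-function form \(e^{\mu t}\Gamma(1+st)\Gamma(1-st)\) should agree with the closed form implemented in logistic_mgf (a sketch using scipy.special.gamma).
# Gamma-function form vs closed form pi*s*t / sin(pi*s*t)
from scipy.special import gamma as gamma_fn
mu_d, s_d = 0.7, 1.3
t_vals = np.array([-0.5, -0.2, 0.2, 0.5]) / s_d  # safely inside |t| < 1/s
gamma_form = np.exp(mu_d * t_vals) * gamma_fn(1.0 + s_d * t_vals) * gamma_fn(1.0 - s_d * t_vals)
print(np.column_stack([t_vals, gamma_form, logistic_mgf(t_vals, mu=mu_d, s=s_d)]))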
6.3 Likelihood (iid sample)#
For data \(x_1,\ldots,x_n\) i.i.d. from \(\mathrm{Logistic}(\mu,s)\), with \(z_i=\dfrac{x_i-\mu}{s}\),
$$L(\mu,s) \;=\; \prod_{i=1}^n \frac{1}{s}\,\sigma(z_i)\bigl(1-\sigma(z_i)\bigr).$$
The log-likelihood is
$$\ell(\mu,s) \;=\; -n\log s - \sum_{i=1}^n z_i - 2\sum_{i=1}^n \log\!\left(1+e^{-z_i}\right).$$
There is no closed-form MLE in general; it is typically found by numerical optimization.
def logistic_loglik(x: np.ndarray, mu: float, s: float) -> float:
return float(np.sum(logistic_logpdf(x, mu=mu, s=s)))
def fit_logistic_mle(x: np.ndarray, mu_init: float | None = None, s_init: float | None = None):
x = np.asarray(x, dtype=float)
if mu_init is None:
mu_init = float(np.median(x))
if s_init is None:
s_init = float(np.std(x, ddof=0) * np.sqrt(3.0) / np.pi)
s_init = max(s_init, 1e-3)
def nll(theta: np.ndarray) -> float:
mu, log_s = float(theta[0]), float(theta[1])
s = float(np.exp(log_s))
return -logistic_loglik(x, mu=mu, s=s)
res = optimize.minimize(nll, x0=np.array([mu_init, np.log(s_init)]), method="BFGS")
mu_hat, log_s_hat = res.x
return {
"mu_hat": float(mu_hat),
"s_hat": float(np.exp(log_s_hat)),
"success": bool(res.success),
"message": res.message,
"fun": float(res.fun),
}
# Compare our simple MLE to SciPy's fit on simulated data
x_data = logistic_rvs(rng, size=5_000, mu=1.2, s=0.8)
ours = fit_logistic_mle(x_data)
scipy_loc, scipy_scale = stats.logistic.fit(x_data)
ours, (scipy_loc, scipy_scale)
({'mu_hat': 1.2258002867931699,
's_hat': 0.7831215632335,
'success': True,
'message': 'Optimization terminated successfully.',
'fun': 8783.809900252894},
(1.2258003058433171, 0.7831215768119856))
7) Sampling & Simulation#
7.1 Inverse transform sampling#
Because the logistic CDF is invertible in closed form, we can sample using the inverse CDF.
If \(U\sim\mathrm{Unif}(0,1)\) and \(X=F^{-1}(U)\), then \(X\) has CDF \(F\). For the logistic:
$$X \;=\; \mu + s\,\log\!\left(\frac{U}{1-U}\right).$$
7.2 Practical notes#
When implementing \(\log\!\left(\frac{U}{1-U}\right)\) numerically, use \(\log U - \log(1-U)\) with log1p for stability.
Clip \(U\) away from exactly 0 and 1 to avoid returning \(\pm\infty\).
Algorithm (vectorized)
Draw \(u \leftarrow \mathrm{Uniform}(0,1)\)
Set \(u \leftarrow \mathrm{clip}(u,\varepsilon, 1-\varepsilon)\)
Return \(x \leftarrow \mu + s(\log u - \log(1-u))\)
# Sampling sanity checks
mu0, s0 = -0.5, 1.7
x = logistic_rvs(rng, size=200_000, mu=mu0, s=s0)
# 1) Mean/variance
print('mean (mc)', x.mean(), 'theory', logistic_moments(mu0, s0)['mean'])
print('var (mc)', x.var(ddof=0), 'theory', logistic_moments(mu0, s0)['variance'])
# 2) Probability integral transform: F(X) should look Uniform(0,1)
u = logistic_cdf(x, mu=mu0, s=s0)
print('u mean', u.mean(), 'u var', u.var(ddof=0))
# Compare a few quantiles to Uniform(0,1)
qs = np.array([0.01, 0.1, 0.5, 0.9, 0.99])
print('empirical u-quantiles:', np.quantile(u, qs))
print('target quantiles :', qs)
mean (mc) -0.5094692359972378 theory -0.5
var (mc) 9.4866561100536 theory 9.507718906382747
u mean 0.4992795266980339 u var 0.08335766686663254
empirical u-quantiles: [0.0099 0.0994 0.4993 0.8989 0.99 ]
target quantiles : [0.01 0.1 0.5 0.9 0.99]
8) Visualization#
We’ll visualize:
the theoretical PDF and CDF for several parameter choices
Monte Carlo samples from the NumPy-only sampler
# PDF/CDF for several parameter choices
params = [
(0.0, 0.6),
(0.0, 1.0),
(0.0, 2.0),
(2.0, 1.0),
]
# choose an x-range that covers all cases (0.001 to 0.999 quantiles)
lo = min(logistic_ppf(1e-3, mu=mu, s=s) for mu, s in params)
hi = max(logistic_ppf(1 - 1e-3, mu=mu, s=s) for mu, s in params)
xx = np.linspace(lo, hi, 800)
fig = make_subplots(rows=1, cols=2, subplot_titles=("PDF", "CDF"))
for mu, s in params:
label = f"μ={mu}, s={s}"
fig.add_trace(go.Scatter(x=xx, y=logistic_pdf(xx, mu=mu, s=s), mode="lines", name=label), row=1, col=1)
fig.add_trace(go.Scatter(x=xx, y=logistic_cdf(xx, mu=mu, s=s), mode="lines", showlegend=False), row=1, col=2)
fig.update_xaxes(title_text="x", row=1, col=1)
fig.update_xaxes(title_text="x", row=1, col=2)
fig.update_yaxes(title_text="density", row=1, col=1)
fig.update_yaxes(title_text="probability", row=1, col=2)
fig.update_layout(title="Logistic distribution: PDF and CDF", width=950, height=420)
fig.show()
# Monte Carlo histogram + PDF overlay
mu0, s0 = 0.0, 1.0
samples_mc = logistic_rvs(rng, size=80_000, mu=mu0, s=s0)
x_grid = np.linspace(logistic_ppf(1e-4, mu0, s0), logistic_ppf(1 - 1e-4, mu0, s0), 900)
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=samples_mc,
nbinsx=70,
histnorm="probability density",
name="Monte Carlo (NumPy-only)",
opacity=0.55,
)
)
fig.add_trace(
go.Scatter(
x=x_grid,
y=logistic_pdf(x_grid, mu=mu0, s=s0),
mode="lines",
name="True PDF",
line=dict(width=3),
)
)
fig.update_layout(title=f"Logistic(μ={mu0}, s={s0}): histogram vs PDF", width=900, height=420)
fig.show()
# CDF: theoretical vs empirical
x_grid = np.linspace(logistic_ppf(1e-4, mu0, s0), logistic_ppf(1 - 1e-4, mu0, s0), 700)
emp_x = np.sort(samples_mc)
emp_cdf = np.arange(1, emp_x.size + 1) / emp_x.size
fig = go.Figure()
fig.add_trace(go.Scatter(x=x_grid, y=logistic_cdf(x_grid, mu=mu0, s=s0), mode="lines", name="True CDF"))
fig.add_trace(
go.Scatter(
x=emp_x[::200],
y=emp_cdf[::200],
mode="markers",
name="Empirical CDF (subsampled)",
marker=dict(size=4, opacity=0.55),
)
)
fig.update_layout(title=f"Logistic(μ={mu0}, s={s0}): CDF vs empirical", width=900, height=420)
fig.show()
9) SciPy Integration (scipy.stats.logistic)#
SciPy parameterization:
stats.logistic(loc=mu, scale=s)
loc is the location parameter \(\mu\).
scale is the scale parameter \(s>0\).
SciPy provides:
pdf, logpdf, cdf, ppf
rvs for sampling
fit for MLE
dist = stats.logistic(loc=mu0, scale=s0)
x_test = np.linspace(-2, 2, 5)
pdf = dist.pdf(x_test)
cdf = dist.cdf(x_test)
samples_scipy = dist.rvs(size=5, random_state=rng)
pdf, cdf, samples_scipy
(array([0.105 , 0.1966, 0.25 , 0.1966, 0.105 ]),
array([0.1192, 0.2689, 0.5 , 0.7311, 0.8808]),
array([-0.9978, 1.6964, -1.7808, -1.3917, -2.842 ]))
# MLE fit example
true_mu, true_s = 1.5, 0.9
x_fit = stats.logistic(loc=true_mu, scale=true_s).rvs(size=10_000, random_state=rng)
mu_hat, s_hat = stats.logistic.fit(x_fit) # returns (loc, scale)
true_mu, true_s, mu_hat, s_hat
(1.5, 0.9, 1.5159770568469728, 0.9025679317202033)
10) Statistical Use Cases#
10.1 Hypothesis testing (location)#
If you assume data are logistic with unknown \((\mu,s)\), a common hypothesis is
$$H_0:\ \mu=\mu_0 \qquad\text{vs.}\qquad H_1:\ \mu\neq\mu_0.$$
You can use a likelihood-ratio test (LRT):
$$\Lambda \;=\; 2\bigl[\ell(\hat\mu,\hat s) - \ell(\mu_0,\tilde s)\bigr] \;\overset{H_0}{\approx}\; \chi^2_1,$$
where \((\hat\mu,\hat s)\) are the unrestricted MLEs and \(\tilde s\) is the MLE under \(H_0\).
10.2 Bayesian modeling#
Error model: logistic noise is a heavy-tailed alternative to Gaussian noise.
Latent-variable logistic regression: if \(Y=\mathbf{1}\{\eta+\varepsilon>0\}\) with \(\varepsilon\sim\mathrm{Logistic}(0,1)\), then \(\Pr(Y=1\mid\eta)=\sigma(\eta)\). This gives the familiar logistic likelihood used in Bayesian logistic regression.
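A quick Monte Carlo check of that latent-variable identity, using the helpers defined earlier (a sketch: thresholding a score perturbed by logistic noise recovers the sigmoid probability).
# Pr(eta + eps > 0) with eps ~ Logistic(0,1) should equal sigma(eta)
etas = np.array([-2.0, -0.5, 0.0, 1.0, 2.5])
eps = logistic_rvs(rng, size=(200_000, 1), mu=0.0, s=1.0)
p_mc = np.mean(etas + eps > 0.0, axis=0)
print(np.column_stack([etas, p_mc, sigmoid(etas)]))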
10.3 Generative modeling#
Inverse-CDF sampling makes logistic a convenient base distribution.
Mixtures of logistics can model multimodal or skewed densities and appear in modern neural generative models.
# 10.1 Likelihood-ratio test example: H0: mu = 0
rng_test = np.random.default_rng(123)
n = 400
mu_true, s_true = 0.35, 1.0
x = logistic_rvs(rng_test, size=n, mu=mu_true, s=s_true)
def mle_unrestricted(x: np.ndarray):
x = np.asarray(x, dtype=float)
def nll(theta: np.ndarray) -> float:
mu, log_s = float(theta[0]), float(theta[1])
s = float(np.exp(log_s))
return -logistic_loglik(x, mu=mu, s=s)
mu_init = float(np.median(x))
s_init = float(np.std(x, ddof=0) * np.sqrt(3.0) / np.pi)
res = optimize.minimize(nll, x0=np.array([mu_init, np.log(max(s_init, 1e-3))]), method="BFGS")
mu_hat, log_s_hat = res.x
return float(mu_hat), float(np.exp(log_s_hat)), float(-res.fun)
def mle_mu_fixed(x: np.ndarray, mu0: float):
x = np.asarray(x, dtype=float)
def nll(log_s: np.ndarray) -> float:
        s = float(np.exp(log_s[0]))  # log_s arrives as a length-1 array from the optimizer
return -logistic_loglik(x, mu=mu0, s=s)
s_init = float(np.std(x, ddof=0) * np.sqrt(3.0) / np.pi)
res = optimize.minimize(nll, x0=np.array([np.log(max(s_init, 1e-3))]), method="BFGS")
    s_hat = float(np.exp(res.x[0]))
return s_hat, float(-res.fun)
mu0 = 0.0
mu_hat, s_hat, ll1 = mle_unrestricted(x)
s_tilde, ll0 = mle_mu_fixed(x, mu0=mu0)
lrt = 2.0 * (ll1 - ll0)
p_value = 1.0 - chi2.cdf(lrt, df=1)
{
"n": n,
"true": (mu_true, s_true),
"mle_unrestricted": (mu_hat, s_hat),
"mle_H0": (mu0, s_tilde),
"LRT": lrt,
"p_value": p_value,
}
{'n': 400,
'true': (0.35, 1.0),
'mle_unrestricted': (0.26463042446936813, 0.9882888911189841),
'mle_H0': (0.0, 0.9992910134947958),
'LRT': 9.410232061566376,
'p_value': 0.0021577791606112173}
# 10.2 Bayesian example: posterior over mu with known scale (grid approximation)
x = logistic_rvs(rng, size=200, mu=0.6, s=1.0)
s_known = 1.0
# Prior: mu ~ Normal(0, 2^2)
mu_grid = np.linspace(-2.5, 2.5, 1201)
log_prior = norm(loc=0.0, scale=2.0).logpdf(mu_grid)
# Log-likelihood for each mu on the grid
log_like = np.array([logistic_loglik(x, mu=mu, s=s_known) for mu in mu_grid])
log_post_unnorm = log_prior + log_like
log_post = log_post_unnorm - np.max(log_post_unnorm)
post = np.exp(log_post)
post /= post.sum()
post_mean = float(np.sum(mu_grid * post))
post_cdf = np.cumsum(post)
ci_low = float(mu_grid[np.searchsorted(post_cdf, 0.025)])
ci_high = float(mu_grid[np.searchsorted(post_cdf, 0.975)])
(post_mean, (ci_low, ci_high))
(0.5288318801511985, (0.2875000000000001, 0.7708333333333335))
# Visualize the posterior
fig = go.Figure()
fig.add_trace(go.Scatter(x=mu_grid, y=post, mode="lines", name="posterior"))
fig.add_vline(x=post_mean, line_dash="dash", line_color="black", annotation_text="posterior mean")
fig.add_vrect(x0=ci_low, x1=ci_high, fillcolor="gray", opacity=0.2, line_width=0)
fig.update_layout(
title="Posterior over μ (known s): grid approximation",
xaxis_title="μ",
yaxis_title="posterior density (discrete grid)",
width=900,
height=420,
)
fig.show()
# 10.3 Generative modeling: a simple mixture of logistics
weights = np.array([0.55, 0.45])
components = [(-1.2, 0.6), (1.4, 0.9)] # (mu, s)
def mixture_logistic_pdf(x: np.ndarray) -> np.ndarray:
x = np.asarray(x, dtype=float)
out = np.zeros_like(x)
for w, (mu, s) in zip(weights, components):
out += w * logistic_pdf(x, mu=mu, s=s)
return out
def mixture_logistic_rvs(rng: np.random.Generator, size: int) -> np.ndarray:
k = rng.choice(len(weights), size=size, p=weights)
out = np.empty(size, dtype=float)
for idx in range(len(weights)):
mask = k == idx
mu, s = components[idx]
out[mask] = logistic_rvs(rng, size=int(mask.sum()), mu=mu, s=s)
return out
mix_samples = mixture_logistic_rvs(rng, size=60_000)
x_grid = np.linspace(np.quantile(mix_samples, 0.001), np.quantile(mix_samples, 0.999), 900)
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=mix_samples,
nbinsx=90,
histnorm="probability density",
name="samples",
opacity=0.55,
)
)
fig.add_trace(go.Scatter(x=x_grid, y=mixture_logistic_pdf(x_grid), mode="lines", name="mixture PDF", line=dict(width=3)))
fig.update_layout(title="Mixture of logistics: histogram vs PDF", width=900, height=420)
fig.show()
11) Pitfalls#
Invalid scale: \(s\le 0\) is not a valid logistic distribution.
Overflow in naive formulas:
np.exp(-z) overflows if \(z\) is very negative.
use stable forms (piecewise sigmoid, logaddexp, log1p); see the sketch after this list.
Sampling at the boundaries:
the inverse CDF uses \(\log\!\left(\frac{p}{1-p}\right)\); if \(p\) is exactly 0 or 1, you get \(\pm\infty\).
clip \(p\) (or the underlying uniform draws) away from {0,1}.
MGF domain:
\(M_X(t)\) exists only for \(|t|<1/s\).
Parameterization confusion:
some sources parameterize logistic by a “steepness” \(k=1/s\).
SciPy uses (loc, scale).
Fitting:
for small samples, MLE can be noisy; prefer robust starting points (median + variance-based scale).
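A small sketch of the overflow pitfall mentioned above: a naive log-density evaluation collapses to \(-\infty\) far in the tails, while the logaddexp form used in logistic_logpdf stays finite.
# Naive vs stable log-PDF far in the tails (mu=0, s=1)
z = np.array([-800.0, -50.0, 0.0, 50.0, 800.0])
with np.errstate(over="ignore", divide="ignore"):
    p_naive = 1.0 / (1.0 + np.exp(-z))                # exp(800) overflows for very negative z
    logpdf_naive = np.log(p_naive * (1.0 - p_naive))  # log(0) = -inf at the extreme values
logpdf_stable = logistic_logpdf(z)                    # -z - 2*logaddexp(0, -z) stays finite
print("naive :", logpdf_naive)
print("stable:", logpdf_stable)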
12) Summary#
logistic is a continuous distribution on \(\mathbb{R}\) with CDF equal to the sigmoid.
Parameters: location \(\mu\in\mathbb{R}\) and scale \(s>0\) (a pure shift/scale family).
Key formulas:
\(\mathbb{E}[X]=\mu\),
\(\mathrm{Var}(X)=\pi^2 s^2/3\),
\(h(X)=\ln(s)+2\),
\(M_X(t)=e^{\mu t}\,\pi s t/\sin(\pi s t)\) for \(|t|<1/s\).
Sampling is easy via inverse CDF: \(\mu+s\log\!\left(\frac{U}{1-U}\right)\).
References
SciPy documentation: scipy.stats.logistic.
Reflection identity: \(\Gamma(z)\Gamma(1-z)=\pi/\sin(\pi z)\).
Mixture of logistics in neural generative modeling: PixelCNN++ (Salimans et al., 2017) uses discretized logistic mixtures.